COLEPL and COLSLM: An Unsupervised WSD Approach to Multilingual Lexical Substitution, Tasks 2 and 3 SemEval 2010
Abstract
In this paper, we present a word sense disambiguation (WSD) based system for multilingual lexical substitution. Our method depends on a WSD system for English and an automatic word alignment method. Crucially, the approach relies on parallel corpora. For Task 2 (Sinha et al., 2009), we apply a supervised WSD system to derive the English word senses. For Task 3 (Lefever & Hoste, 2009), we apply an unsupervised approach to the training and test data. Both of our systems that participated in Task 2 achieve a decent ranking among the participating systems. For Task 3, we achieve the highest ranking on several of the language pairs: French, German and Italian.
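The abstract names three ingredients: English word senses, automatic word alignment, and parallel corpora. As a rough illustration of how they could fit together, the sketch below counts the target-language words aligned to each sense-tagged English word and ranks them by frequency. This is a minimal sketch under stated assumptions, not the authors' implementation: the function names, the data layout, the sense labels such as `bank%finance`, and the frequency-based ranking are all hypothetical, and the toy corpus is invented for the example.

```python
from collections import Counter, defaultdict


def translation_candidates(aligned_corpus):
    """Count target words aligned to each (English lemma, sense) pair.

    Each corpus item is (src_tokens, tgt_tokens, alignment), where
    src_tokens is a list of (lemma, sense_or_None) pairs, tgt_tokens is a
    list of strings, and alignment is a list of (src_idx, tgt_idx) links.
    This layout is an assumption made for the sketch.
    """
    counts = defaultdict(Counter)
    for src_tokens, tgt_tokens, alignment in aligned_corpus:
        for src_idx, tgt_idx in alignment:
            lemma, sense = src_tokens[src_idx]
            if sense is not None:  # skip words that carry no sense tag
                counts[(lemma, sense)][tgt_tokens[tgt_idx]] += 1
    return counts


def rank_substitutes(counts, lemma, sense, k=3):
    """Return up to k target words most often aligned to this sense."""
    candidates = counts.get((lemma, sense), Counter())
    return [word for word, _ in candidates.most_common(k)]


# Toy English-French data with hypothetical sense labels and alignments.
corpus = [
    ([("the", None), ("bank", "bank%finance")],
     ["la", "banque"], [(0, 0), (1, 1)]),
    ([("a", None), ("bank", "bank%finance")],
     ["une", "banque"], [(0, 0), (1, 1)]),
    ([("river", None), ("bank", "bank%river")],
     ["rive", "du", "fleuve"], [(0, 2), (1, 0)]),
]

counts = translation_candidates(corpus)
finance_subs = rank_substitutes(counts, "bank", "bank%finance")
river_subs = rank_substitutes(counts, "bank", "bank%river")
```

In this toy setup, the sense assigned to the English word decides which translation counts are consulted, which is why the two senses of "bank" yield different candidate lists.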
Similar resources
USYD: WSD and Lexical Substitution using the Web1T corpus
This paper describes the University of Sydney’s WSD and Lexical Substitution systems for SemEval-2007. These systems are principally based on evaluating the substitutability of potential synonyms in the context of the target word. Substitutability is measured using Pointwise Mutual Information as obtained from the Web1T corpus. The WSD systems are supervised, while the Lexical Substitution syst...
HIT-WSD: Using Search Engine for Multilingual Chinese-English Lexical Sample Task
We have participated in the Multilingual Chinese-English Lexical Sample Task of SemEval-2007. Our system disambiguates senses of Chinese words and finds the correct translation in English by using the web as WSD knowledge source. Since all the statistic data is obtained from search engine, the method is considered to be unsupervised and does not require any sense-tagged corpus.
SemEval-2010 Task 3: Cross-Lingual Word Sense Disambiguation
We propose a multilingual unsupervised Word Sense Disambiguation (WSD) task for a sample of English nouns. Instead of providing manually sensetagged examples for each sense of a polysemous noun, our sense inventory is built up on the basis of the Europarl parallel corpus. The multilingual setup involves the translations of a given English polysemous noun in five supported languages, viz. Dutch,...
Unsupervised Cross-Lingual Lexical Substitution
Cross-Lingual Lexical Substitution (CLLS) is the task that aims at providing for a target word in context, several alternative substitute words in another language. The proposed sets of translations may come from external resources or be extracted from textual data. In this paper, we apply for the first time an unsupervised cross-lingual WSD method to this task. The method exploits the results ...
FCC: Modeling Probabilities with GIZA++ for Task 2 and 3 of SemEval-2
In this paper we present a naïve approach to tackle the problem of cross-lingual WSD and cross-lingual lexical substitution which correspond to the Task #2 and #3 of the SemEval-2 competition. We used a bilingual statistical dictionary, which is calculated with Giza++ by using the EUROPARL parallel corpus, in order to calculate the probability of a source word to be translated to a target word ...